Primitive Skill-based Robot Learning from Human Evaluative Feedback
Reinforcement learning (RL) algorithms face significant challenges when
dealing with long-horizon robot manipulation tasks in real-world environments
due to sample inefficiency and safety issues. To overcome these challenges, we
propose a novel framework, SEED, which leverages two approaches: reinforcement
learning from human feedback (RLHF) and primitive skill-based reinforcement
learning. Both approaches are particularly effective in addressing sparse
reward issues and the complexities involved in long-horizon tasks. By combining
them, SEED reduces the human effort required in RLHF and increases safety in
training robot manipulation with RL in real-world settings. Additionally,
parameterized skills provide a clear view of the agent's high-level intentions,
allowing humans to evaluate skill choices before they are executed. This
feature makes the training process even safer and more efficient. To evaluate
the performance of SEED, we conducted extensive experiments on five
manipulation tasks with varying levels of complexity. Our results show that
SEED significantly outperforms state-of-the-art RL algorithms in sample
efficiency and safety. SEED also substantially reduces the human effort
required compared to other RLHF methods. Further details and video
results can be found at https://seediros23.github.io/
Decentralized Vehicle Coordination: The Berkeley DeepDrive Drone Dataset
Decentralized multiagent planning has been an important field of research in
robotics. An interesting and impactful application in the field is
decentralized vehicle coordination in understructured road environments. For
example, in an intersection, it is useful yet difficult to deconflict multiple
vehicles with intersecting paths in the absence of a central coordinator. We learn
from common sense that, for a vehicle to navigate through such understructured
environments, the driver must understand and conform to the implicit "social
etiquette" observed by nearby drivers. To study this implicit driving protocol,
we collect the Berkeley DeepDrive Drone dataset. The dataset contains 1) a set
of aerial videos recording understructured driving, 2) a collection of images
and annotations to train vehicle detection models, and 3) a kit of development
scripts for illustrating typical usages. We believe that the dataset is of
primary interest for studying decentralized multiagent planning employed by
human drivers and, of secondary interest, for computer vision in remote sensing
settings. Comment: 6 pages, 10 figures, 1 table
NOIR: Neural Signal Operated Intelligent Robots for Everyday Activities
We present Neural Signal Operated Intelligent Robots (NOIR), a
general-purpose, intelligent brain-robot interface system that enables humans
to command robots to perform everyday activities through brain signals. Through
this interface, humans communicate their intended objects of interest and
actions to the robots using electroencephalography (EEG). Our novel system
demonstrates success in an expansive array of 20 challenging, everyday
household activities, including cooking, cleaning, personal care, and
entertainment. The effectiveness of the system is improved by its synergistic
integration of robot learning algorithms, allowing NOIR to adapt to
individual users and predict their intentions. Our work enhances the way humans
interact with robots, replacing traditional channels of interaction with
direct, neural communication. Project website: https://noir-corl.github.io/
Determining crystal structures through crowdsourcing and coursework
We show here that computer game players can build high-quality crystal structures. Introduction of a new feature into the computer game Foldit allows players to build and real-space refine structures into electron density maps. To assess the usefulness of this feature, we held a crystallographic model-building competition between trained crystallographers, undergraduate students, Foldit players and automatic model-building algorithms. After removal of disordered residues, a team of Foldit players achieved the most accurate structure. Analysing the target protein of the competition, YPL067C, uncovered a new family of histidine triad proteins apparently involved in the prevention of amyloid toxicity. From this study, we conclude that crystallographers can utilize crowdsourcing to interpret electron density information and to produce structure solutions of the highest quality.